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Abstract. We express the asymptotics of the remainders of the partial sums 
of the generalized hypergeometric function q+iFq ^ ^'^ ' | through 

an inverse power series z"n^ where the exponent A and the asymptotic 

coefficients {cf^} may be recursively computed to any desired order from the 
hypergeometric parameters and argument. From this we derive a new series 
acceleration technique that can be applied to any such function, even with 
complex parameters and at the branch point 2 = 1. For moderate parameters 
(up to approximately ten) a C implementation at fixed precision is very effec- 
tive at computing these functions; for larger parameters an implementation in 
higher than machine precision would be needed. Even for larger parameters, 
however, our C implementation is able to correctly determine whether or not 
it has converged; and when it converges, its estimate of its error is accurate. 



1. Introduction 

The generalized hypergeometric function pFg( ^J'''''^^ | z) is ubiquitous in ap- 
pHed mathematics; a wide array of special functions are particular cases of this 
function. Hence the numerical evaluation of this function is an important problem. 
In many instances, specialized methods for a particular special function are the 
most computationally efficient, but there are still situations where the generalized 
hypergeometric function must be evaluated for computationally challenging choices 
of parameters and argument. In this paper we present a new algorithm that is 
able to evaluate many of those challenging cases, and we describe its numerical 
implementation and testing. 

1.1. Analytic properties of the generalized hypergeometric function. To 

understand the difficulties, we briefly recall the definition and analytic properties of 
the generalized hypergeometric function; for more details see the reference site the 
Digital Library of Mathematical Functions [ ] or its print version [_], the monograph 
of Slater [.;] , or the text of Graham et al. [4] . Such functions are characterized by 
a set of upper and lower parameters and a single argument; in the most general 
case the argument and any of the parameters may be complex. The function can 
be defined through a Taylor series about the origin, where that converges: 



(1-1) pF, 



ai, . . . , Qfp 
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Here {a)k ~ a{a + 1) • • • (a + fc — 1) is the Pochhammer symbol. The function 2F1 
is sometimes called simply the hypergeometric function or Gauss' hypergeometric 
function; the generalized functions are then all of those with other numbers of 
parameters. In this paper we will refer to any function given by (1.1) as a generalized 
hypergeometric function, or sometimes simply a hypergeometric function. 

As a function of the complex argument z, the analytic characteristics of pFq 
depend on the relative number of upper and lower parameters. When p < q, the 
function is entire and so the power series converges everywhere. When p — q + 1 — 
the case we will we concerned with in this paper — the series converges for z < 1. 
On the unit circle the behavior of the series (1.1) with p = q + 1 depends crucially 
on the real part of the parameter a, defined as: 



Then the series converges at z = 1 if dia < 0, and it converges elsewhere on the 
unit circle if Jftcr < 1 . 

Outside of the unit circle. q+iFq may be defined through analytic continuation. 
This may be effected through the defining differential equation satisfied by the 
generalized hypergeometric function (see [1]); that equation is singular at z = 1 and 
the function has a branch point there (logarithmic if cr S Z; algebraic otherwise). 
The branch cut is conventionally taken along the positive real axis from z = 1 to 
infinity. 

For the Gaussian hypergeometric function 2F1 we can in fact evaluate the func- 
tion at the branch point in closed form, thanks to Gauss' formula 



that is valid whenever 3fJ(/3i — ai — > 0. This formula will prove very useful 
later for testing our method. 

Finally, though we will not be concerned with the case p > q + 1 m this paper, 
we do note that in this case the radius of convergence of the series is zero. 

With this background, consider now the numerical computation of q^iFg. The 
advice given in the standard reference Numerical Recipes [•')] is to simply use the 
series (1.1) directly when |z| ^ 1, and to integrate the defining differential equa- 
tion of the generalized hypergeometric function elsewhere, of course taking care 
not to cross the branch cut, and avoiding the neighborhood of the branch point. 
This leaves open, however, the question of how to evaluate the function at or near 
the branch point, where the differential equation is singular and the series slowly 
convergent. 

1.2. Summary of results. We address that problem in this paper through a novel 
series acceleration technique. For any qj^iFq, we will show that the sequence {s„} 
of partial sums: 



(1.2) 




(1.3) 





n-l 



(1.4) 
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satisfies an asymptotic expansion of the form: 

oo 

(1.5) Sn ^ S + U Z^n'''}^ —r, 71 — > CX), 

Z — / yjK 

k=0 

for undetermined constants s and /i but for asymptotic coefficients that can 
be computed to any desired order recursively from the hypergeometric parameters 
ai, . . . , ag+i, Pi,...,Pq and the complex argument z. The exponent A will be 
shown to be: 



(1.6) 




when z — 1 
(T — 1 otherwise 



with a defined as in (1.2). 

Using the asymptotic expansion (1.5), we may use any two successive computed 
partial sums s„ and s„+i to estimate the undetermined coefficients s and /i; the 
estimate for s then becomes the accelerated estimate for the sum and hence the 
function /j'j^..'.^^^^ | itself. We describe in some detail a numerical imple- 

mentation of this technique, which estimates both truncation and ffoating point 
errors to determine either that the algorithm has converged to the user specified 
tolerance, or that convergence is impossible at standard machine precision. Even in 
the latter case, however, the algorithm itself is still capable of accelerating conver- 
gence; it just must be implemented at higher than machine precision. We present 
tests of this algorithm to show that it is robustly able to either accelerate con- 
vergence (in many cases dramatically) or correctly conclude that a higher working 
precision is needed. The algorithm is particularly effective at the branch point 
z = l. 

There are two key features of this work that deserve highlighting: 

(1) Our emphasis is on a robust algorithm that can handle complex parameters 
and argument. We want it to succeed as often as possible — without user in- 
tervention to determine convergence — and to reliably indicate failure when 
it has not converged. Considerable effort has therefore been spent in de- 
signing error estimation and stopping criteria, and the tests summarized in 
section 3.2 are designed to thoroughly probe how well the algorithm meets 
these criteria. 

(2) The key novelty of the algorithm is its analytic calculation of the remainder 
asymptotics. This is possible only because we have an analytic expansion 
of the term ratio in inverse powers. Thus, this algorithm requires detailed 
analytic knowledge about the series accelerated. This is a strength of the 
method in that we might hope (and in fact will see) that specific ana- 
lytic knowledge about our series allows our method to succeed where other 
methods fail. But it is also a limitation, since there is no obvious way to 
generalize the method to series that are only known numerically (certainly 
a very important class). Nevertheless, even within this limitation we believe 
that there is more to be explored, as many functions and series beyond the 
q+iFq functions we consider here may be amenable to this approach. 

1.3. Previous work. The literature on computing the generalized hypergeometric 
function depends on the restrictions placed on the number and values of the pa- 
rameters. For real parameters and argument to 2^1 ( "/j"^ | 2)1 it is often efficient 
to piece together different approximations based on the values of the parameters 
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and the argument; this is the approach taken, for example, by the popular GNU 
Scientific Library [ ]. A more detailed analysis of the algorithms appropriate for 
different (real) parameters may be found in the work of MuUer [~\ for the particular 
case of the confluent hypergeometric function iFi. Pearson's master's thesis [ ] con- 
siders both the confluent hypergeometric function and 2F1, and moreover considers 
complex parameters and argument. The paper [ ] of Forrey describes software that 
uses functional transformations and difference equations to evaluate the Gaussian 
hypergeometric function for arbitrary real argument, and Becken and Schmelcher 
[10] consider analytic continuation formulae to again customize the computation 
based on the range of the argument. Chatterjee and Roy [., .■] consider a modifica- 
tion of standard Levin-type methods tailored to the hypergeometric function, and 
Gautschi [ ] considers evaluation of both Gaussian and confluent hypergeometric 
functions for complex arguments, but real parameters, using Gaussian quadrature 
to evaluate integral representations of the functions. Weniger looked at using tra- 
ditional series acceleration methods but irregular input data in [ ] and considered 
divergent hypergeometric series at 2; = — 1 using a method tailored to alternating 
series in [1 I]. Finally Kalmykov [L5] and Kalmykov et al. [id] consider an expansion 
of the Gaussian function near integer values of its parameters. 

For complex parameters and argument for the generalized hypergeometric func- 
tion q+iFq(^ I z), the literature is much more sparse. Skorokhodov con- 
siders analytic continuation via symbolic manipulations for the generalized hyper- 
geometric function in the neighborhood of z = 1 in [17, 18]. Ferreira et al. [10] 
consider an expansion valid for larger lower parameters of the Gaussian function. 
Aside from the recommendation of ] ] already given above, Perger et al. ] ] im- 
plement the defining series (1.1) directly, in higher than machine precision. While 
this can protect against some instances of floating point error, it does not speed 
the convergence of the series itself. The most sophisticated software package to 
handle the generalized hypergeometric function that the present author has en- 
countered is Johansson's mpmath Python module ] ]. That package uses a mix 
of direct series calculation and analytic continuation, as well as the Shanks' series 
acceleration method near the unit circle but away from z = 1. Near the branch 
point Euler-Maclaurin summation is used. 

Series acceleration has a long history; we will give a brief review and more point- 
ers to the literature in section 2.1. Here we only mention a few recent techniques 
that have been specifically applied to hypergeometric functions. Wozny and Nowak 
]__] and Wozny ['' '] consider a new series acceleration technique that they ap- 
ply (among other examples) to some instances of the generalized hypergeometric 
function. Their approach is based on finding certain difference operators that ap- 
proximately annihilate the remainder term of the series. Paszkowski ] ] considers 
several acceleration algorithms and how they may be modified if an asymptotic 
form for the partial sums is known. While it is mentioned that the generalized 
hypergeometric functions belong to this class, an explicit expression for the asymp- 
totic coefficients is not given, and only a few low order expansions are considered 
in examples. Likewise Lewanowicz and Paszkowski [25] consider an acceleration 
method based on the asymptotic expansion of the term ratio, as we will, but those 
authors do not apply it to the asymptotics of the partial sum, and they are only 
able to accelerate certain parameter choices for 3F2 functions at z = ±1. In Sko- 
rokhodov ['"-] and Bogolubsky and Skorokhodov [''"] the authors use an asymptotic 
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expansion of the terms — rather than the term ratios — to calculate an approximant 
to the truncation error that they evaluate using Hurwitz zeta functions. We shall 
compare the performance of our method to theirs in section 4.1. 

Perhaps closest in spirit to the present work is that of Weniger [ ]. which is 
based on finding asymptotic approximations to the remainder terms of a partial 
series summation through symbolic linear algebra. That method can in some cases 
yield analytic expressions, as for the Dirichlet series of the Riemann zeta function 
(section 5 of [28]) or the divergent Euler series for the exponential integral (as in 
Borghi [-■)])■ However, the method presented in [ ] is challenging even for 2F1, 
particularly as compared to the complete asymptotic series that we will give in this 
paper. 

Finally, we mention that Biihring has considered both the behavior of the gen- 
eralized hypergeometric functions near unity [ ' ], seeking to find the analogue of 
Gauss' formula (1.3) for higher order functions; and separately considered the as- 
ymptotic behavior of the partial sums of generalized hypergeometric functions at 
unity [ ]. In each case, the expressions derived are nested infinite sums of hy- 
pergeometric functions of lower order, so the results do not seem well adapted to 
numeric computation. More practically useful for us is his work in [-S^, -).'>] and the 
earlier work of N0rlund [ ], which give expansions valid near the branch point. 
They can conceivably be leveraged to take an efficient evaluation method at the 
branch point and evaluate a hypergeometric function near the branch point using 
expansions valid in a neighborhood of the branch point. 



The well-known Euler 's method shows that series acceleration dates to at least 
the eighteenth century, and both Knopp [ ] and Tweddle [ ] cite Stirling as the 
earliest to develop a series acceleration method, but the last several decades have 
seen the development of a variety of sophisticated techniques. For reviews, see the 
articles of Brezinski [-HT], Homeier [38], and Weniger [39], and the monographs of 
Brezinski [40, 41], Brezinski and Redivo Zaglia ] '''], Sidi [-t:^], Walz ] ] and Wimp 
[41]. We will give just enough background in the next section to place our new 
method in context. 

2.1. Review of series acceleration methods. The basic idea of any series ac- 
celeration technique is to use the expected form of the partial sums {s„} of a 
series — which by themselves may be slowly convergent or even divergent — to cre- 
ate a new sequence {s^} that converges to the same limit s (or, in the case of a 
divergent series, antilimit), but that does so more rapidly, in the sense that: 



To motivate these transformations, we start from the explicit expression of the 
sequence {sn} of partial sums by means of the terms {tk} of the series: 



2. Accelerating the convergence of the series 



(2.1) 




0. 



n-l 



(2.2) 




k=0 
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Now partition the partial sum into its (anti-)limit s and the remainder pn'. 

oo 

(2.3) p„ := - ^ tk 

k—n 

SO that: 

(2.4) s„ = s + p„. 

Various algorithms can then be devised by approximating the remainder as p„ w 
w„/i„, where aj„ is an explicit remainder estimate, and /i„ is an 0(1) correction 
factor containing m free parameters. The order of the transformation is m. 

For instance, if the terms of the series are alternating, then a natural estimate 
of the remainder would be the first term not included, i„. If we choose a;„ — i„ 
and choose: 

m— 1 

for some positive [3, and undetermined coefficients c^, then from any sequence of 
m + 1 successive partial sums, we may determine values for the m coefficients Ck 
and the limit s. This choice of remainder estimate a;„ and correction factor /u^ 
gives rise to Smith and Ford's modification [4()] of Levin's t transformation [47]; 
other choices for either w„ or fi„ give other sequence transformations; for details 
see the review articles and monographs mentioned earlier. The actual estimation 
of s from the to + 1 partial sums can be expressed as a ratio of determinants, but 
is more commonly implemented recursively .■'!!]. 

Among all of these methods, we will single out one known as the i?-method 
[42, 48-50], because it largely subsumes all of the others as special cases: starting 
from a set of functions {(?^(f^)}"4l, one requires: 

m 

(2.6) Sn = s + y^^c^g^{n). 

i=Q 

From the to + 1 sums s„, s„+i, . . . , Sn+m we can calculate s and the c^; the value 
of s is then the accelerated sum determined by the _E-method for that particular 
choice of functions gi and those partial sums. We will compare the performance of 
this method to that of this paper in section 4.3. 

The effectiveness of such algorithms depends on the nature of the series to be 
summed; for instance, we mentioned above that the t transformation was designed 
with alternating series in mind. One central characteristic of convergent series that 
inffuences the effectiveness of acceleration methods is whether that convergence is 
linear or logarithmic. The former means that if the limit of the sequence of partial 
sums is s, then 

(2.7) lim ^"+^ ~ = i 

n->oo Sn — S 

with < |£| < 1. On the other hand, ii i — 1, then the convergence is logarith- 
mic [.■)!]. 

The distinction is important because while several series acceleration methods 
can be shown to accelerate the convergence of any linearly convergent series ] ] , 
it is known from the work of Delhaye and Germain-Bonne ] ] that no method is 
able to accelerate the convergence of all logarithmically convergent series. Yet at 
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the branch point of the generaHzed hypergeometric function (as we will show in 
the next section) the convergence of the series is logarithmic. Section 14 of [39] 
contains a summary of methods that may be applied to logarithmically convergent 
series; while alternating series are often tractable, with a generic choice of complex 
parameters the terms in the generalized hypergeometric series are complex and 
exhibit no particular sign pattern (see Sidi ['i'-'] for examples of the effect of irregular 
sign patterns on series acceleration techniques). Moreover, some series acceleration 
techniques are quite sensitive to the details of the asymptotic form of the remainders 
[.')!)]; they are able to accelerate convergence when /?„ ~ n^*^ for an integer k, but fail 
for non-integral but real exponents. For complex exponents — as we will encounter 
for generalized hypergeometric sums — very little indeed seems to be known. 

Thus, while there is a wide array of sophisticated series acceleration methods 
able to speed the convergence, often substantially, of many series, and even to 
sum many strongly divergent series, there does not seem to be a method that 
is broadly applicable to the series expansion of the generalized hypergeometric 
function, particularly at its branch point. Even very new methods, such as those 
in [22, 23, 28], typically consider only real hypergeometric parameters. 

Instead, we propose a method that can be applied to any generalized hypergeo- 
metric of the form q^iFq. Unlike conventional series acceleration methods, instead 
of a simple choice for the remainder estimate aj„ and a complicated choice for the 
correction /i„ (so that several successive partial sums are needed to calculate each 
s'j) we will use a very precise remainder estimate but just a constant for our correc- 
tion factor. Thus, we will need only two successive partial sums to compute each 
estimate of the series limit. We turn now to the derivation of that method. 

2.2. Derivation of the partial sum asymptotics. By comparing the general 
form (2.2) to the expression (1.4) for the partial sums of the generalized hypergeo- 
metric and making use of the definition of the Pochhammer symbol, we may write 
the ratio of two successive terms of the generalized hypergeometric function as: 

^ ' h ""(/3i+fc)...(/3, + fc)(l + fc) ■^''^''>' 

that is, the ratio of two successive terms is a rational function of the index k. This 
property of generalized hypergeometric functions is very well known; indeed, any 
function whose term ratio is a rational function of the term index may be expressed 
in terms of generalized hypergeometric functions [4] . 

Because the ratio of terms satisfies a first-order recurrence relation, both the 
remainders {p„} and the partial sums {s„} satisfy a second-order recurrence; in 
fact, the two sequences satisfy the same second order recurrence. In equations, we 
have: 

(2.9) ^ = z r(n) = ^"+^ = - Pn+i 
and therefore: 

(2.10) Sn+2 ^ (l + ^ + 2r(n)s„ = 

(2.11) pn+2 - (l + zr{n)^pn+i + zr{n)pn = 

The key point is that the remainders and partial sums each satisfy a linear, 
homogeneous, second-order difference equation, and the asymptotic solutions of 



8 



JOSHUA L. WILLIS 



such equations are known very precisely. In principle, we could use either of equa- 
tion (2.10) or (2.11) as our starting point, and typically in series acceleration it 
would be more natural to focus on the behavior of the remainders. However, it is 
slightly more convenient to use (2.10) and base our results directly on the asymp- 
totics of the partial sums themselves. That is because there are two linearly in- 
dependent solutions to our difference equation, and we will find that one of those 
two solutions is always a constant, and the other approaches zero as n — >■ cx) when 
\z\ < 1, but diverges as n — > cx) when \z\ > 1. Were we to focus solely on the 
remainders, all we could say was that when |z| < 1 only the decreasing solution can 
be present, since we know the remainders go to zero in that case. By directing our 
attention instead to the partial sums, we see that there is always a constant term 
(the value of the function we are trying to find) as well as a remainder (the second 
solution) that diverges when we are outside the radius of convergence, and goes to 
zero inside the radius of convergence. Thus, we show directly that our asymptotic 
acceleration not only speeds the convergence of the series when it does converge, 
but also slows the divergence when it does not. 

So consider the general problem of the asymptotic solutions (valid for large n) 
of a linear, homogeneous, second-order difference equation that may be written in 
the form: 

(2.12) Wn+2 + a{n)wn+i + b{n)wn ^ 

where the coefficient functions a(n) and b{n) are themselves known, and have as- 
ymptotic expansions: 

C30 OO 7 

(2.13) «w-E^' ^w-E^' 

fc=0 k=0 

This problem has been considered by several authors, beginning with Birkhoff [ ', 
T)-!], Adams [ ], and Birkhoff and Trjintzinsky [ ]. More recently, this body of 
work has been reviewed and summarized by both Wimp and Zeilberger [^i7\ and 
Wong and Li [ ], who all find Birkoff's work notable for its complexity. For us, 
by far the most useful reference will be [ ] , since the authors carefully analyze the 
several possible cases that may arise when finding asymptotic solutions to (2.12), 
and also provide explicit, recursive formulas for arbitrary asymptotic coefficients. 
That will be crucial. We follow the terminology of [ ], but not in general the 
notation. 

The analysis of [^S] only considers the case in which bo of (2.13) is not zero. As 
we will soon see, that will limit the pFg that our acceleration can handle to those for 
which p = q + 1. In a later paper [59] the same authors consider the more general 
case, and so likewise we will consider the analysis of generalized hypergeometrics 
where p 7^ g -|- 1 in a separate work. 

With that restriction on 60, the authors of [oN] show that there are two lin- 
early independent asymptotic solutions of (2.12), which fall into one of three cases 
(depending on the lowest few asymptotic coefficients of a(n) and b{n)) as follows: 

The normal case.: When the two roots ^1 and ^2 to the characteristic equa- 
tion 



(2.14) 



+ ao C + ^0 = 
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are distinct, then the two hnearly independent solutions to (2.12) are each 
of the form: 



(2.15) a;„~fV^ 



fc=0 



for the two values of ^, and the exponent A for each solution depends on ^ 
through: 

(2.16) A + 



The subnormal case.: When the roots of the characteristic equation do 
coincide, but the double root is not the zero of the auxiliary equation 
ai <^ + 6i =0, then the two solutions are of the form: 



(2.17) ^±-^n-Ce^^^'n^J2^' " ^ 

k=0 



where 7 and A may be explicitly determined, but we will not need them. 
The exceptional case.: When the roots of the characteristic equation coin- 
cide and the double root is also the root of the auxiliary equation, then the 
two linearly independent solutions are again given by equation (2.15), but 
now the two values of A are given not by (2.16), but rather as the two roots 
of the indicial equation: 

(2.18) A(A-l)e2 + (aiA + a2)e + &2=0. 

There are some further complications considered in [ ] when the two roots 
of this equation either coincide or differ by a positive integer, but we will 
not need those subtleties. 

In each of these three cases, the leading coefficient cq of the asymptotic expansion 
may be taken, without loss of generality, to be one, and the higher coefficients are 
then determined recursively from the {a„} and through formulas that we will 
quote from [58] later. We will not discuss the derivation of these cases and the 
corresponding formulas, except to say that superficially the method is much like 
the series solution of differential equations: one proposes a form of the solution 
and inserts it into the equation, and this recursively determines all of the higher 
coefficients. But showing that the resulting formal solutions are indeed asymptotic 
is far from trivial, and for details the reader is referred to [58] and the rest of the 
literature cited. 

We see immediately that the recursion (2.10) satisfied by the partial sums s„ is 
of the form (2.12), provided we take: 

(2.19) a{n) -.^ ~{\ + zr{n)^ b{n) zr{n). 

To apply the results of we need an asymptotic expansion for r(n). But for 
large n, that is easy; because p = g + 1 we see from (2.8) that r{n) is a rational 
function whose numerator degree equals its denominator degree, and so we divide 
both numerator and denominator by n'"*"^ and write: 

^ ' ^ (l + /3ix)...(l + /3,x)(l + x) 
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where we have defined x as 1/n. This rational function (we deliberately use the same 
symbol) has a convergent Taylor series expansion in a neighborhood of zero, and 
the coefficients of that Taylor series will coincide with the asymptotic coefficients 
of r{n) as n — > oo. We will need arbitrarily many of these coefficients for our full 
acceleration method, and efficiently calculating those is not trivial, so we defer it 
to the next subsection. However, to determine which of the three cases above we 
fall under, we need only the lowest two, and elementary calculus yields: 

(2.21) ro = 1 

(2.22) ri=^afc-^/?fc-l = a-l. 



k=l k=0 



Finally, we will also need: 
(2.23) 



a,i = 



-l-z iin = Q 
—zrn otherwise 
(2.24) bn^zrn 



and the consequent identity a„ = —bn whenever n > 1. Note that bo precisely 
because we assume p = q + 1. 

Having considered the generalities, we now turn to precise formulas for the 
asymptotics of the {s„}. As those depend on whether or not we are at the branch 
point z = 1, we take up those two cases in turn. Before beginning that discussion, 
though, we point out that none of the results of the next two subsections apply when 
one of the upper or lower parameters is a non-positive integer. That is because in 
those situations the recursion (2.10) does not really describe the asymptotics of 
the partial sums as n — >■ oo; rather, in the first situation the hypergeometric is a 
terminating polynomial, and in the second, it is undefined. So although our results 
will not apply, either case is easy to identify and handle without series acceleration 
techniques. 

2.2.1. Partial sum asymptotics away from the branch point. For our recurrence 
(2.10), we have from (2.21-2.24) that the characteristic equation is: 

(2.25) C^-(l + z)C + z = 0, 

and the roots of this equation are 1 and z. Thus, provided we are not at the branch 
point of the hypergeometric function, the recurrence equation is in the normal case 
of the three listed above. We can also find the corresponding exponents from (2.16), 
and we easily determine that the exponent corresponding to ^ = 1 in fact vanishes, 
whereas the exponent corresponding to the root ^ = 2:isri=f7 — 1. 

To find the asymptotic coefficients for each of these cases, we need the recursion 
that those coefficients satisfy. That is given in equation (2.3) of ['''^], and in our 
notation is: 



fe-i 



(2.26) 



It is convenient to rewrite this equation so that it explicitly gives in terms of 
{co, . . . , Ck-i}- Rearranging terms, making use of equations (2.23) and (2.24), and 
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renaming the index of summation yields 
(2-27) c. = r7T^E 



k(l - z) 
^ ' j=o 



Cj, 



for the root ^ = 1 (and therefore A = 0), and 



k-l 



3=0 





1 




)(:: 


1 





k 



1 



for the root ^ = z (and X = a — 1). Each of these equations is valid for all A: > 1. 

In equation (2.28) we have expressed the recursion in terms of the asymptotic 
coefficients r„ of r(n), rather than the a„ and 6„, but we have not troubled to do 
that for the asymptotic coefficients for the = 1 root. That is because all of the 
coefficients Ck in equation (2.27) are zero when k > 1. To see this, first consider 
the coefficient of cq on the right hand side of (2.27). The binomial coefficients (j) 
for any integer I vanish, except for ([]) which equals one. Thus, since j < fc + 1, the 
first term in the square brackets vanishes because the binomial coefficient does, as 
do all of the terms in the inner sum over i, except for the i = j = term. That 
term survives to give ak+i, but that in turn cancels the term bk+i- Hence, the 
entire coefficient of cq vanishes. 

But by the recursion (2.27), that means that ci vanishes, and then by induction 
that all of the Ck vanish when fe > 0, for the ^ = 1 root of the characteristic 
equation. Hence, comparing to (2.15), we see that the first asymptotic solution to 
the difference equation (2.10) is simply a constant. That is not true for the second 
solution, however, and so since the partial sum s„ is a linear combination of these 
two asymptotic solutions, we have shown: 

oo 

(2.29) s„ - s + ^z"n'^-i^^, n^oo, 

k=0 

with a given by (1.2) and the Ck determined recursively from co = 1 by equa- 
tion (2.28). 

This result holds so long as z 7^ 1; in particular, it holds outside the radius of 
convergence \z\ = 1. In that case, however, (2.29) shows that the remainder term 
diverges as n 00; the series is not convergent there. It clearly converges whenever 
\z\ < 1, and on the circle of convergence the remainder is a decreasing function of 
n provided the real part of the exponent of n is negative; that is, provided 5Rcr < 1. 
So, our asymptotic formula correctly reproduces the analytic properties of q+iFq 
whenever z ^ 1; we next consider z = 1. 

2.2.2. Partial sum asymptotics at the branch point. When z = 1 we are no longer in 
the normal case of ["x^]. To decide between the subnormal and exceptional cases, we 
must examine the auxiliary equation ai^ + bi — 0. As ai = —bi and ^ = z = I, our 
double root is in fact a root of the auxiliary equation, so we are in the exceptional 
case. The indicial equation when ^ = 1 is: 

(2.30) A(A- 1) + (aiA + a2) + 62 = 0. 
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A = 



or 



X — ri + 1 — a 



as our two possible exponents. 

Just as for the normal case, we must now examine the recursive equations that 
determine the asymptotic coefficients c„ in terms of the {r„} for each of these 
two possible values of A. When we do so we again find that only the leading, 
constant term survives for A = 0, while for X — a the entire series is nontrivial. 
Similar considerations allow us to avoid some special sub-cases of the exceptional 
case alluded to above. Generically, the solutions to the recursion equation can have 
terms logarithmic in n when the difference between the two roots of the indicial 
equation are an integer, but for all of the cases where the hypergeometric series at 
the branch point converges (that is, where 3?cr < 0) we can show through inductive 
arguments similar to that above that the logarithmic term vanishes. Thus, the 
most general asymptotic solution is of the form: 



where again cq — I, and the higher coefficients can be shown from equation (7.2) 
of [58] to satisfy: 



As with regular points of the function, we see that the asymptotic expansion at 
the branch point reproduces the correct analytic behavior of the q+iFq function. 
Specifically, the remainder is a decreasing function precisely when 5Rct < 0, the 
condition we saw in section 1.1 was needed to ensure that the series converged at 
z — 1. We also see explicitly from (2.32) that the convergence of the series at z = 1 
is logarithmic, with a complex critical exponent (in general). 

2.3. Recursively computing the asymptotic coefRcients. The two asymp- 
totic expansions (2.29) and (2.32), together with the respective recursions (2.28) 
and (2.33), are a complete solution for the asyniptotics of the partial sums of the 
generalized hypergeometric function. To be numerically useful, however, we must 
be able to calculate arbitrary asymptotic coefficients rk of the rational term ratio 
function r(n). It is here that we can make use of the explicit analytic knowledge 
we have of our series; it comes to us not just as a numerical sequence of terms. As 
we noted, the are the same as the Taylor coefficients of the expansion around 
zero of the rational function r{x) of equation (2.20), but that in and of itself is not 
a practical solution, if we have no better means to calculate the Taylor coefficients 
than by evaluating high order derivatives at zero. 

Fortunately, there is a much more effective procedure. Define the functions P{x) 
and Q{x) as: 



oo 



(2.32) 




n — ^ oo, 



(2.33) 




(2.34) P{x) = (1+aix) • • • (l+aq+ix); 



Q{x) 



1 



{1 + (3ix) ■ ■ ■ {1 + (3qX){l + x) 
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SO that r{x) — P{x)Q{x). It is obvious that the Taylor coefficients of r{x) satisfy 

fe 

(2.35) rfc = ^F,Qfc_, 

when {Pk] and {Qfe} are the Taylor coefficients of the respective functions. 

The coefficients Pk and Qk can be calculated effectively because we have already 
factored the functions P{x) and Q{x). Specifically, rearranging some results of [GO] 
gives: 

9+1 

(2.36) P[x) = 

efc(ai, . . . ,aq+i)x 

k=0 

and: 

oo 

(2.37) Q(x) = 5](-l)'=/ife(l, /?!,..., /3g)a;'=. 

k=0 

In these equations, and are the elementary symmetric polynomials and the 
complete homogeneous symmetric polynomials, respectively. These may be defined 
in several ways, but the most computationally useful method defines them recur- 
sively from the power sums of the upper and lower parameters (where the lower 
parameters must be augmented with the implicit parameter one that is part of 
the definition of the generalized hypergeometric function). The k-th power sum 
Pk{xi, . . . , Xn) of any set {xi} of n numbers is: 

n 

(2.38) pfc(xi,...,x„) ^^xf. 

1=1 

In terms of the power sums, the elementary symmetric polynomials satisfy the 
recurrence: 

k 

(2.39) kckiai, . . . ,aq+i) = ^(-l)*"^efc_j(Q;i, . . . ,aq+i)pi{ai, . . . ,0^+1) 

i=l 

and the complete homogeneous symmetric polynomials satisfy the recurrence: 

(2.40) khkih /3l, ...,/?,)= ^ hk-^{l, . . . , Pq)p^{l, Pi,..-, Pq). 

i=l 

These recurrences do not determine eo or ho; both are unity. 

It is now straightforward to calculate the Taylor coefficients for any desired k. 
We first calculate the power sums pi of the upper and augmented lower parameters 
for i from zero to k. Then we use the recurrence (2.39) to find the elementary sym- 
metric polynomials of the upper parameters, and the recurrence (2.40) to find the 
complete homogeneous symmetric polynomials of the augmented lower parameters. 
Finally, combining equations (2.35-2.37) gives: 

k 

(2.41) rfe = ^(-1)^-V,(ai, . . . , aq+i)hk-,{l, Pi,..., Pq). 

i=0 
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Table 1. Stability results for recursive computation of the as- 
ymptotic coefficients at the branch point. All errors are geometric 
means across the sample of the maximum error among the m co- 
efficients. 





Rel. error in 


Rel. error in wiq 


R 


m = 30 m = 45 


TO = 30 TO = 45 


1 
5 

10 
50 
100 


3.9 X 10"" 5.5 X 10"" 
4.9 X 10-12 1.6 X 10-1" 
3.9 X 10-1^ 8.7 X 10-12 
1.9 X 10"" 4.1 X 10-" 
1.8x10"" 2.7x10"" 


2.6 X 10-1*^ 2.6 X lO"!** 
7.5 X 10-15 7.8 X 10"i5 
2.3x10"" 1.7xlO"i2 
1.9 X 10"" 4.1 X 10"" 
1.8x10"" 2.7x10"" 



2.4. Stability of the recursions. The expressions for the asymptotic coefficients 
(2.28) and (2.33), together with the formulas of the last subsection for calculat- 
ing the Taylor coefficients, provide a determination of the asymptotic coefficients 
through several nested recursions. It is therefore important to study the stability of 
these recursions. Given the nesting of the recursions, an analytic study is daunting, 
so we investigate the stability numerically. Even though most of our numeric results 
on the performance of our algorithm are presented in the next section, we digress 
briefly to study the stability of the coefficient calculation here. That is both because 
this study was quite different from the analysis of section 3, as it was conducted in 
much higher than machine precision, and also because we need an assurance of the 
stability of the asymptotic coefficient calculation before we can examine the overall 
method. 

Accordingly, we used the mpmath [21] package already mentioned in the in- 
troduction to implement the calculation of the coefficients in fixed but arbitrary 
precision. We compared the calculation at a precision of 53 bits (standard double 
precision, as we will use in the C implementation discussed in the next section) to 
that calculated with twice the precision, at 106 bits. We expect that the accuracy of 
the computation will depend on the size of the parameters, the value of z, and the 
number of coefficients calculated, since error presumably accumulates throughout 
the recursion. To investigate these effects, we studied both z = 1, and z chosen 
uniformly at random in the unit disk. We used four upper and three lower parame- 
ters, with real and imaginary parts each chosen uniformly in the range (— i?, R), for 
R e {1, 5, 10, 50, 100}. Unlike the actual implementation of the full algorithm itself 
that we study in the next section, we did not restrict ourselves to only cases where 
the hypergeometric series itself converges. To examine the effect of the number of 
coefficients, we considered both to = 30 and to = 45 coefficients. For each choice 
of m and R, we then chose 1000 sets of seven parameters (and also z if not testing 
the branch point) at random as described above. 

To quantify the results, for each coefficient we found the relative error between 
the value calculated in 53 bit precision and that in 106 bit precision. We then 
took the maximum of this relative error out of all to coefficients. To obtain some 
measure of central tendency among the thousand distinct random samples for a 
given m and R, we took the geometric mean of these thousand worst-case errors, 
since it is the order of magnitude of the result that is most important. The results 
of this calculation are shown in the second and third columns of table 1 for the 
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branch point cases (the unit disk cases are similar and are not shown) . We can see 
already that there is only a weak dependence of these results on to, indicating that 
nested recursions are in fact quite stable as we consider more and more asymptotic 
coefficients. Somewhat surprisingly, the average worst-case relative error is seen to 
be higher when R is smaller, even though as we will see in the next section it is larger 
values of R that make the overall computation of the generalized hypergeometric 
function more difficult. We might also be surprised that the typical worst-case errors 
can be as large as 10^^°, since we will see results in the next section that indicate 
we can typically calculate the hypergeometric function itself to higher accuracy in 
those ranges of R. 

This is because the relative errors in the asymptotic coefficients are not by 
themselves what is most relevant. More important is the error estimate w„ that we 
calculate from those coefficients. If our error is largest in the highest coefficients, 
then that error will be suppressed when we divide by n'^. Thus, in the fourth and 
fifth columns we show the relative error in wiq. Note that 10 is a very conservative 
number of terms to sum; for large parameter values we may typically see many tens 
or even hundreds of terms that must be summed to calculate the hypergeometric 
function. Yet we see that we are already very close to just a couple of orders of 
magnitude above machine precision for almost all values of R, and in particular 
the relative errors of 1 x 10^^^ and 2 x 10^^* that we shall use in the next section 
as relative tolerances to demand of q+iFg are reasonable. More importantly, the 
rather remarkable precision with which we can calculate w„ is the key reason that 
the method of this paper is much more stable than the £'-method, as we will see in 
section 4.3. 

3. Implementation and results 

We now have all of the pieces in place for an acceleration algorithm. To assemble 
them, we truncate either equation (2.32) or equation (2.29) for the asymptotics of 
our partial sums: 

(3.1) s„ = s + + 0(2"^^-™) 
where we have defined the truncated remainder estimate: 

m — 1 

(3.2) J-^ :. z"n^ ^. 

fc=0 

Here A is a—1 away from the branch point, and a at z — 1. Likewise, the asymptotic 
coefficients Ck are given by either (2.28) or (2.33), as appropriate. We call to 
the order of our method, and we use equation (2.41) to calculate the asymptotic 
coefficients regardless of whether or not z = 1. We will need m + 1 of the 
Tfc to calculate to coefficients cj.; though it might seem from equation (2.33) that 
we would need to + 2 of the when at the branch point, in fact the different 
appearances of rm+2 cancel. 

We emphasize that despite our detailed analytic knowledge of wi'"'' , it is not by 
itself a remainder estimate; we must also know the factor fi. That cannot be fixed 
by any of our analytic calculations; it must be estimated directly from the sequence 
of partial sums. This is a somewhat subtle point, but distinguishes our method 
from either Euler-Maclaurin summation [(il] or methods based on zeta- functions 
[26, 27, 62]. Those methods each compute an approximation to the remainder 
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directly, and need the partial sums only to combine with that approximation; our 
method requires two computed partial sums so that we may fix /i. It is in this 
respect more like fixed-order series acceleration methods, such as Shanks' method, 
though we continue to refer to m as the order for our method, as it is m that 
determines the rate of acceleration. 

From equation (3.1) and any two successive computed partial sums s„ and s„+i, 
we may calculate an accelerated suni Sn that is hopefully a better approximation 
to s than either s„ or s„_|_i: 

(m) (m) 
(-3 3-, ^(m) ^ ~ '^n+l^rt 

^ ' ' ^ (m) (m) 

This equation encapsulates our basic method; it is defined only for m > 0. 

Of course, since our estimates for the remainders are only asymptotic, it may 
require several computed partial sums before the asymptotic behavior of the re- 
mainder is reached; we expect this number to increase with increasing m, and also 
to depend on the parameters and argument of the hypergeometric. Even once we 
have reached the asymptotic regime for some particular value of m, we may need 
to continue calculating estimates for larger n to ensure that the 0(z"n'''^™) er- 
ror term is smaller than whatever tolerance is desired; again, we expect that the 
number of such additional iterations will depend on z and the hypergeometric pa- 
rameters. Nevertheless, the basic algorithm contained in equation (3.3) is ripe for 
an example, so consider the following evaluation, which we can do in closed form 
thanks to Gauss' formula (1.3): 
(3.4) 

1 + 4i, 1.5 -I- 4.5i ^ . ^ _o.003206491294324765 - 0.006293652031968077i. 



2F1 

We expect this to be challenging for any series summation technique: all of the 
parameters are complex and we evaluate the series at the branch point; the rate of 
decay of the remainders is only n""'^/^. 

As we can see in Figure 1, however, our method is indeed able to accelerate 
the convergence of this hypergeometric function dramatically. The topmost curve, 
which shows the relative error (5s„ = |1 — s„/s| with no acceleration, has no correct 
digits even after summing ten thousand terms, but the acceleration with 30 asymp- 
totic coefficients produces ten digit accuracy with roughly ten terms. This figure 
also shows clearly how the order m of the method affects the speed of convergence; 
the linear slopes of the Jsl™'' on a log-log scale illustrate the power law fall-off pre- 
dicted by equation (2.32). Finally, we see that the approach to convergence need 
not be monotonic: notice how 5sn jumps above the unaccelerated series before it 
becomes asymptotic. 

However, even a small change in the parameters can spoil this behavior. As a 
second example, consider the acceleration of 



(3.5) 2F1 



l + 20i,1.5 + 25i 
3 + 15i 



(-1.508618716765084 -I- 2. 168373234294654i) x 10 



-20 



We plot this in Figure 2 using 45 asymptotic coefficients. We do initially see very 
rapid decrease in the relative error of the accelerated sum. However, the relative 
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Figure 1. The relative error Ssn"''' for different orders m of accel- 
eration of the hypergeometric function of (3.4) . 
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Figure 2. Relative error in the partial sums and accelerated par- 
tial sums of equation (3.5). 
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error then levels off at about 10^^. This is despite the very small change made 
to the parameters: we have increased the imaginary parts of each parameter by 
roughly a single order of magnitude, and we have not changed the real parts at all. 

3.1. Numerical considerations. Of course there is really no mystery here: the 
largest magnitude that the partial sum in the evaluation of (3.5) reaches is roughly 
6.5 X lO^''; or thirty-seven orders of magnitude greater than the true value of the 
function. Certainly no method implemented in fixed IEEE 754 precision can ac- 
complish such an acceleration. 

Obviously we could simply implement the algorithm in a symbolic algebra pro- 
gram that can use arbitrary precision. However, in at least some cases this may not 
be desirable. For instance, the author's original interest in this project grew out of 
a problem in computational quantum gravity that requires the evaluation of thou- 
sands of 4_F'3(1) with complex parameters as part of a larger program. Even where 
higher precision is available it is useful to have our algorithm correctly determine 
when that higher precision is really needed. Finally, because of our precise knowl- 
edge of the asymptotics of the remainder, we can provide an excellent estimate of 
the error the algorithm makes when it does converge. 

The essential problem is that the accuracy of the algorithm is determined by 
its truncation error 0{z'^n^~'^) in equation (3.1), but before that truncation error 
becomes sufficiently small, it may be overwhelmed by the floating point error. Thus, 
we need to estimate both sources of error as accurately as we can, and compare 
them to decide whether our accelerated summation has converged to a prescribed 
accuracy. In the next two sections we discuss each of these errors in turn. 

3.1.1. Estimating truncation error. In many numerical problems the best estimate 
of truncation error may simply be the change between successive estimates, at least 
when that change begins to exhibit convergence. However, because we know from 
equation (3.1) precisely how the truncation error behaves asymptotically, we can 
provide a sharper estimate. Subtracting successive estimates gives: 

(3.6) s^:% - .i™) « Az-ri^-^ (^1 + 1^ - 1^ 

for some constant A. But in fact Az"n^^™ is the leading order of the truncation 
error, so this suggests that a more accurate estimate of the truncation error would 
be: 

(m) (m) 

(3.7) At,.!") = _^3±i^ 

z(l + i)-"-l 

In our implementation, however, we use the modified estimate: 

(m) (m) 

(3.8) 

Using this estimate greatly reduces the number of false positives (where the al- 
gorithm believes it has converged but has not) when inside the unit disk |z| < 1 
but with 3? A ^ (see section 3.2.2 for details). Note that it is only when |z| < 1 
that the real part of the critical exponent can be positive; at the branch point 
such series do not converge and the hypergeometric function is undefined there. 
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Likewise, even though the simpler estimate AtrSn"'' = (■'i'+i ~ si™'') has similar 
rates of convergence or non-convergence to those we will present in section 3.2.2, 
the estimated error is not nearly as accurate as with the estimate (3.8) above, and 
so we prefer (3.8) for that reason also. 

3.1.2. Estimating floating point error. The estimation of floating point error is more 
complex. We must be careful, because overestimation of the floating point error can 
be just as dangerous as underestimation: it will cause us to abandon a calculation 
that may well have been converging. The very simplest estimates — for instance, 
comparing sL™Vm to the floating point precision — are disastrous for all but the 
smallest ranges of parameters, so we consider a more detailed analysis. Examin- 
ing equation (3.3), we can identify three primary sources of potentially significant 
floating point error in the calculation of si™^ : 

(1) Subtractive cancellation in either the numerator or denominator of (3.3); 
in each we subtract two nearby fioating point numbers that we expect will 
only grow closer to each other as n increases. 

(2) Underflow in the calculation of wi™-*, particularly if 5RA ^ or \z\ <C 1. 

(3) Accumulated floating point error in s„ and s„+i, which are calculated re- 
cursively using: 

(3.9) Sn+l = Sn + tn, 

(3.10) tn+1 = zr{n)tn. 

We will need some notation for our analysis. We denote the computed value of 
some exact quantity x hy x. The absolute error between x and x is Ax, while the 
relative error is 5x, so that: 

(3.11) x = x + /:^x = x{l + 5x). 

Applying this to the formula (3.3) for and calling the numerator of that 

formula Mn and its denominator I?„, we get: 

(3.12) + A2?„)(si™) + Asl")) = (AA„ + AAA„) 
implying: 

(3.13) Asi") = ^(AA/; - si") A2?„). 

To derive an estimate from this formula, we must approximate the errors AA/'n 
and A2?„. We estimate the former by assuming that it is dominated by the error 
in calculating the partial sums: 

(3.14) /XN'n ~ As„|a;„+i| + As„+i|u;„|. 

We estimate AP„ in terms of the relative error of the remainder estimates, rather 
than absolute. That is because we wish to vary our estimate for the remainder 
error based on whether or not the calculation underflowed: 



(3.15) 5uJn 



if z"n'^ is a normalized floating point number. 



n 



otherwise. 



In this equation, tp is the machine precision at which a;„ is computed, and /norm is 
the smallest normalized number in that precision. This subtlety is needed because 
when LOn becomes sufficiently small, it will no longer be represented by a floating 
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point number with a relative accuracy of tp, but rather with a relative accuracy that 
decreases from tp down to zero. Such floating point numbers are called subnormal, 
and /norm IS the smallest floating point number that is still normalized (possessing 
the full relative precision of Ep). 

Of course, these estimates involve many approximations. For example, the rela- 
tive error in w„ is likely larger than given by (3.15) when it is a normalized number, 
as it is itself calculated by a non-trivial chain of floating point computations. But 
in those cases it is not a dominant source of error overall, and so (3.15) suffices. 
Combining equations (3.13-3.15), approximating unknown exact quantities by their 
computed equivalents when needed, and taking absolute values to allow for an upper 
bound on the error, we arrive at the final estimate used in our implementation: 

(3.16) Afpsl™) = |. ^ . I [As„ + <Sc.„+i" 

+ p„| As„+i -I- \s^^'^\5ujn 

This equation is not complete until we specify how we estimate As„. We choose 
a simple type of a posteriori estimate; for details, see chapter 3 of Higham's mono- 
graph [ ]. The advantage of such estimates over the more straightforward a priori 
estimates is that they take into account cancellation that occurs during the com- 
putation, and are therefore less prone to overestimate the error (though they can 
be more expensive to compute). 

Our estimate starts from equation (3.9) and treats t„ as an exact quantity, 
accounting only for the accumulation of error in the recursive sum in s„. Then it 
is not hard to show that the appropriate a posteriori estimate starts from Asq = 
and recursively calculates 

(3.17) As„+i = As„ + ep |s„+i| 

where again ep is the machine precision at which the computation is carried out. 

Of course, in reality f„ is also corrupted by ever-growing error, since it too 
is calculated recursively. It is possible to apply a similar a posteriori analysis and 
derive an estimate for As„ that takes this into account. However when this method 
was implemented, it tended to severely overestimate the error, greatly reducing the 
convergence rate and increasing the number of false negatives (where the algorithm 
does not believe it has converged, even though it has). At the same time it was 
more expensive to compute and even though it could terminate some non-converging 
cases more rapidly, its overall performance was slower, even for parameter ranges 
where a substantial majority of cases did not converge. Hence, for the rest of this 
paper we consider only the estimate (3.17) when estimating the error in the partial 
sums, and we use that in our overall estimate (3.16) for the floating point error in 
our accelerated sum. 



3.1.3. Final algorithm. With our estimates for truncation and floating point errors 
in hand, we can flnally state our complete algorithm, which seeks to approximate 
a generalized hypergeometric function to specified accuracy, or determine that this 
cannot be done without higher precision. It takes as input not only the hyper- 
geometric parameters and argument, but also the order m, the desired relative 
tolerance e, and a maximum number of iterations N . The errors are estimated as 
described in the preceding two subsections. In pseudocode we have: 
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Input: z, q, {aJi^i , {/^JLn "i, N, and e. 
Calculate: Exponent A and {ck}^SQ 
Initialize: sq, si, and S2; uji and 0-12! s^™-*; Asq 
do: Calculate s„+i, As„+i, and ujn+i 




From s„. s„+i. aj„ and ujn+i, calculate sj™]^ 
From s„, s„+i, w„, a;„+i, As„, and As„+i calculate Afpsi^ 
From si™' and calculate Atrsi'"'' 

n + 1, s„+i s„, As„+i -J> As„, and w„+i w„ 



else if rAfpSn > AtrS„ return "Insufficient precision" 
else return "Maximum iterations reached" 

Note in particular that the order of the return statements is such that if on the 
same iteration we simultaneously reach a tolerance less than the specified tolerance 
but also the floating point error exceeds the truncation error, we nonetheless deem 
the acceleration to have converged. This is the less conservative approach, but when 
these boundary cases are instead treated as failure, we greatly increase the number 
of false negatives, without avoiding any additional false positives. The parameter 
T is an empirical "fudge factor" to lower the estimate of the floating point error; 
in our final implementation it was set to 0.1 as this somewhat decreases the false 
negative rate. 

3.2. Testing the method. The algorithm that we have now derived and described 
in some detail was implemented in C, to test its effectiveness as a general purpose 
computational strategy for q^iF^i^ /3'i,..'.J^^ | z)- Of course we should like to val- 
idate it through testing, but we immediately face the problem of what to test it 
against, since the generalized hypergeometric function can be so challenging to 
compute in the cases we have in mind. 

One choice of course is to test the calculation of 2-Fi( "/j"^ | z) at the branch 
point, since that is simultaneously non-trivial for a series based computation, yet 
easily benchmarked against Gauss' formula (1.3). That indeed forms the bulk of 
our test suite, but we also examined some 3i^2 and 4^3 functions inside the circle 
of convergence. We compared these against the corresponding calculations from 
the Python package mpmath [ ], which as mentioned in the introduction takes a 
fairly sophisticated approach to calculating generalized hypergeometric functions. 
However it was still too slow (at high precision, which we used to ensure accuracy) 
to test 3F2 and 4^3 functions at the branch point for a large number of randomly 
generated cases. 

All of our test results below will be presented in terms of a parameter R, which 
sets the scale for choosing the parameters. The precise selection was as follows: 

Test cases in the unit disk: We choose a point z with uniform probability 
inside the disk \z\ < 1. We then choose q + 1 upper parameters and q lower 
parameters, with the real and imaginary parts of each chosen uniformly at 
random between —R and R. 

Test cases at the branch point: We choose the upper and all but one of 
the lower parameters with both real and imaginary parts chosen uniformly 
in (— i?, i?), but we choose the last lower parameter so that Sftcr < 0. To 
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do this, if the real part of the sum of the upper parameters less the sum 
of the already chosen lower parameters is less than zero, we simply choose 
the real part of the last lower parameter uniformly between that value and 
R. But if that sum is positive, we choose uniformly between that value and 
either R or that value plus O.IR, whichever is greater. So the last lower 
value may have a real part greater than R. 

With that procedure in mind, there are several questions we wish to investigate 
empirically: What is the optimal choice of order m? What is the accuracy of the 
method, and how often does it terminate correctly? How accurate is the estimation 
of error? What are optimal choices for the maximum number of allowed terms, 
A^? How fast is the method? We present results on all of these questions in the 
following sections; for most of these we only consider 2^i(l)- However when we 
examine the overall accuracy and termination we also consider cases in the unit 
disk, as described above. 

3.2.1. Convergence as a function of order. We begin with an investigation of the 
effect of the order on the convergence of the method. We have already seen in a 
simple example that the order does indeed dramatically affect the rate of conver- 
gence, yet at the same time higher order requires more precomputation and slows 
the execution of the method, so we only wish to invest in this when it is helpful. 
But if we choose too low an order all of the rest of our tests will be essentially 
meaningless. 

Thus, we considered q+iFg(l) for q ~ 1, 2 and 3. We generated 10'' random 
cases for each value of q and for each R in {1, 5, 10, 50, 100}. We required a relative 
tolerance of 2 x 10^^^, and then for each randomly chosen parameter set we called 
the algorithm with each value of m from 5 to 50. For each parameter set for which 
at least one of these calls converged, we then observed for which m the fewest 
number n of partial sums were needed to achieve the specified convergence; we 
called this the optimal order rriopt- We will not present all of the results here, but 
simply an illustrative example for the 4^3 functions; the results for other choices of 
q are similar. 

We can see from figure 3 that there is a dependence on R, but note that regardless 
of R the maximum optimal m is 45, even though we tested up to 50; at these 
ranges of R, at least, there is simply no point in using more than 45 asymptotic 
coefficients. Of course, this does not prove that using more coefficients will lead to 
faster execution, since the increased precomputation may offset the need for fewer 
partial sums. We will examine the speed of the algorithm in section 3.2.5. But 
we do conclude from this plot and the similar plots for 2F1 and 3^2 that 45 is a 
reasonable upper choice for the order; when we wish to compare the effect of order 
on other tests we will also present results for to = 30. 

3.2.2. Convergence as a function of the hypergeometric parameters. By far the most 
important question for our method is whether or not it converges, and whether or 
not it reliably determines when it has. Our most extensive testing was on this 
question. 

To study it, we again considered R e {1,5,10,50,100}, to = 30 and to — 45, 
and a relative tolerance e of either 1 x 10~^^ or 2 x 10~^*. For each permutation 
of these parameter choices, we generated 10^ random samples and compared the 
computed with the exact value of 2-Fi(l)- We considered four possible scenarios: 




Convergence: The algorithm claimed that it converged to the specified tol- 
erance, and the true relative error was within a factor of ten of that specified 
tolerance. 

False positive: The algorithm claimed that it had converged, but the true 
relative error was more than ten times the tolerance. 

No Convergence: The algorithm claimed it did not converge, and its true 
relative error was more than the tolerance. 

False negative: The algorithm claimed it did not converge, yet its true rel- 
ative error was less than the tolerance. 

The factor of ten is somewhat arbitrary but merely reflects the uncertainty in our 
truncation error estimate; in this preliminary investigation it is too stringent to 
demand that true error be strictly less than than the tolerance, though we will see 
in section 3.2.3 that this is almost always the case. Finally, in addition to these 
four scenarios (which are mutually exclusive) we also show the percentage of each 
sample that reached the maximum number of allowed iterations (2 x 10'' in this set 
of tests) without converging. These samples are not really cases that could have 
converged had they been given more time, but rather cases where the floating point 
error was underestimated and the algorithm did not terminate early with failure as 
it should have. 

For the case m = 45, these results are summarized in tables 2 and 3. We 
do not present the results for to = 30 since they differ very little from these; 
typically at most a percent. We can see from these tables that the convergence 
rate depends strongly on R, as we would expect from our earlier examples. But 
the false positive rate is extremely low; and even that overstates the issue: all of 
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Table 2. Accuracy of 2Fi{l) for m — and e = 1 x 10^^^ 



R 


C 


FP 


NC 


FN 


^max 


1 


100.0% 


0.0% 


0.0% 


0.0% 


0.0% 


5 


93.82% 


0.046% 


5.91% 


0.23% 


5.21% 


10 


78.89% 


0.073% 


20.24% 


0.80% 


13.18% 


50 


36.40% 


0.014% 


62.57% 


1.02% 


11.90% 


100 


22.62% 


0.006% 


76.56% 


0.81% 


9.13% 



Table 3. Accuracy of 2^^1(1) for m ~ 45 and e = 2 x 10 



R 


C 


FP 


NC 


FN 


^max 


1 


98.90% 


0.012% 


1.08% 


0.01% 


0.90% 


5 


81.39% 


0.096% 


17.97% 


0.54% 


14.64% 


10 


65.90% 


0.095% 


32.87% 


1.13% 


19.59% 


50 


29.96% 


0.019% 


68.66% 


1.36% 


12.26% 


100 


18.53% 


0.009% 


80.70% 


0.77% 


9.20% 



the cases labeled as false positives in fact converged but with a slightly higher ratio 
between the true and estimated error. Had we chosen our adjustment factor to be 
200 instead of 10, the false positive rate would be zero in all cases. Thus, at the 
branch point we conclude that when the algorithm terminates with success, it is 
essentially always reliable. The false negative rate is also low, though not nearly 
so small as the false positive rate. It is also somewhat ambiguous, since it tells us 
only that when the algorithm terminated with failure, it was in fact really within 
the prescribed tolerance; it does not identify scenarios where had the calculation 
continued further, convergence would have been achieved. A wide variation in this 
rate is therefore possible based on the choice of floating point error estimate; our 
implementation uses the choice that gave the smallest false negative rate of those 
estimates we considered. . Comparing table 2 to table 3, we can see that requiring 
higher accuracy does decrease the percentage of convergent cases, with an effect 
most pronounced for values of R in the middle of the range we considered; at high 
R, the percentage of convergent cases is small enough that most of the effect of the 
choice of e is masked. 

We also investigated the accuracy for points chosen in the unit disk. Here for 
simplicity we considered only to = 45 and e = 2x10"^"*, but we were able to examine 
3F2 and 4F3 functions as well, though for smaller ranges in R. These results are 
shown in table 4. We see that as we might expect the overall convergence rates are 
higher than for the corresponding value of R at the branch point, and though the 
data is somewhat limited there does not seem to be a strong dependence on the 
order q of the hypergeometric; certainly not nearly as strong as the dependence on 
R. One subtlety not shown in this table is that it is no longer true that the (albeit 
rare) false positives are necessarily benign; roughly half of the false positives for 
the R = 100 cases of 2F1, for instance, had fewer than half of the digits correct; 
in many cases no digits correct. This always happens when the exponent A has a 
large positive real part; as already mentioned in section 3.1.1, our choice of estimate 
for the truncation error reduces the fraction of these false positives. We can now 
quantify that assertion: had we used the truncation error estimate (3.7) instead 
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Table 4. Accuracy of q^iFq{z) calculations for \z\ < 1 with m — 
45 and e = 2 X 10"^'*. 





R 


C 


FP 


NC 


FN 


^max 




1 


99.86% 


0.002% 


0.14% 


0.0% 


0.0% 




5 


94.63% 


0.15% 


5.10% 


0.12% 


0.011% 




10 


85.98% 


0.13% 


13.70% 


0.19% 


0.036% 




50 


49.32% 


0.12% 


50.36% 


0.20% 


0.17% 




100 


31.73% 


0.11% 


68.03% 


0.13% 


0.11% 




1 


99.76% 


0.03% 


0.21% 


0.0% 


0.0% 


3F2 


5 


92.85% 


0.20% 


6.78% 


0.10% 


0.01% 




10 


84.88% 


0.13% 


14.83% 


0.16% 


0.03% 




1 


99.74% 


0.0% 


0.25% 


0.010% 


0.0% 


4F3 


5 


91.35% 


0.12% 


8.44% 


0.09% 


0.03% 



of (3.8), then even with an adjustment factor of one thousand instead of ten, our 
percentage of false positives for the \z\ < 1, R = 100 case of 2F1 would be 11.22%, 
or one hundred times greater. The false positives that still persist even using (3.8) 
are those where the terms of the series temporarily become much smaller (by many 
orders of magnitude) before again increasing. This causes the series to appear 
to be rapidly converging when it is not, and our acceleration method is unable 
to distinguish this from true convergence. But neither are other methods; even 
commercial algebraic packages were found to falsely return convergence on these 
cases, unless specifically instructed to calculate results to very high precision. We 
know of no reliable way of deciding that this will happen; we must just fortuitously 
choose to calculate at sufficiently high precision. 

In summary, except for those very few cases just mentioned, the method is highly 
effective at either accelerating the series or determining that a higher precision is 
needed; the error estimate it returns is almost always accurate to within a factor of 
ten. If the arguments to the function are themselves much larger than about ten, 
then it is increasingly unlikely that the method will converge in double precision, but 
it will correctly identify that failure, and a sufficiently high precision implementation 
should succeed. 

3.2.3. Accuracy of the error estimate. The results of the previous section clearly 
indicate that our truncation error estimate is quite reliable, but it is useful to 
consider this in more depth. Thus, from each of the trials that converged or were 
false positives in the tests of the previous section, we can examine the ratio of the 
true relative error to that estimated by the algorithm from equation (3.8). As an 
example of this behavior, we show in figure 4 a relative cumulative frequency plot 
for each of the five values of R we tested, for the 2-P'i(l) function with m = 30 and 
e — 2x 10^^"*. As this figure shows, the error estimate is excellent and only weakly 
depends on R; in 90% or more of cases the estimated error is less than the true 
error, and for essentially all cases it is within a factor of two. 

3.2.4. Number of partial sums needed. Another important parameter of the method 
is the maximum number of iterations allowed before the method is deemed to have 
failed, regardless of the error estimate. Since up to 20% of cases may reach this 
limit, it is important not to have it unnecessarily high, as otherwise we waste time 
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Figure 4. The cumulative relative frequency of the ratio of the 
true error etruo to the estimated error among those of the randomly 
chosen calculations that converged, when the order m is 30 and the 
requested relative tolerance is 2 x lO"^**. 



on an unsuccessful calculation. A sample of such behavior — again as a cumulative 
relative frequency plot — is shown in figure 5; results for other choices of m or e 
are similar. Unlike the error ratio, we see a much stronger dependence on the size 
of the parameters, as measured by R, but we can see that at least for R up to 
100 choosing the maximum N to be about one thousand is quite conservative; for 
smaller parameter ranges this can be reduced even further. 

3.2.5. Speed of the algorithm. Finally, apart from the accuracy of the algorithm its 
most essential characteristic will be its speed. To examine this, we generated one 
thousand sets of parameters for each of our usual set of five R values, and then 
averaged the time of each parameter set from one hundred runs on a 1000 MHz 
AMD Athlon processor under Linux. We report separately the time for all runs 
versus just those runs that converged, and we set the relative tolerance at 2 x 10""'^'' 
and the maximum number of allowed iterations at 2 x 10^. For evaluations at the 
branch point, we obtain the results shown in table 5. 

We see from table 5 that as we would expect, increasing the order of the algorithm 
does increase the execution speed, but more so for the smaller parameter choices, 
where the algorithm converges quite quickly and there is little to be gained by using 
more asymptotic coefficients. Within a given order m, we also see the time increase 
with R, by roughly a factor of two as we move from R = 5 to R — 100. The results 
for cases inside the unit disk are similar, except that the increase in running time 
as we increase m is more significant; at least a factor of two for all values of R, not 
just small values. 
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Figure 5. The cumulative relative frequency of the number of 
terms n that were needed among those of the randomly chosen 
calculations that converged, when the order ni is 30 and the re- 
quested relative tolerance is 2 x 10~^^. 

Table 5. Speed of the algorithm at the branch point, when N — 
2 X 10^ and e = 2 x 10"". 





m = 30 


TO = 45 


Function 


R 


All 


Only converged 


All 


Only converged 




1 


0.154 ms 


0.0951 ms 


0.307 ms 


0.250 ms 




5 


0.675 ms 


0.115 ms 


0.858 ms 


0.292 ms 


2J^l(l) 


10 


0.911 ms 


0.145 ms 


1.15 ms 


0.339 ms 




50 


0.982 ms 


0.332 ms 


1.25 ms 


0.505 ms 




100 


1.11 ms 


0.539 ms 


1.38 ms 


0.728 ms 




1 


0.119 ms 


0.109 ms 


0.338 ms 


0.330 ms 




5 


0.680 ms 


0.124 ms 


0.843 ms 


0.307 ms 


3^^2(1) 


10 


0.966 ms 


0.157 ms 


1.19 ms 


0.323 ms 




50 


1.15 ms 


0.330 ms 


1.39 ms 


0.485 ms 




100 


1.21 ms 


0.406 ms 


1.46 ms 


0.546 ms 




1 


0.118 ms 


0.0981 ms 


0.322 ms 


0.310 ms 




5 


0.726 ms 


0.118 ms 


0.855 ms 


0.258 ms 


4^3(1) 


10 


1.05 ms 


0.148 ms 


1.29 ms 


0.308 ms 




50 


1.21 ms 


0.286 ms 


1.46 ms 


0.438 ms 




100 


1.33 ms 


0.381 ms 


1.58 ms 


0.546 ms 
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4. Comparison with other methods 

While the tests of the previous section are a convincing vaHdation of the method 
of this paper, it is also important to compare the method to others. Here we are 
somewhat hampered: most of the literature on computing the generalized hyper- 
geometric function gives only a few examples, and not the kind of large, randomly 
selected test cases we used for testing. Also interest is often focused on obtaining 
the greatest accuracy with the fewest terms summed, and robustness considera- 
tions (such as automatic termination) are rarely mentioned. For these reasons, in 
this section we use a different implementation of our algorithm, in Python, using 
mpmath ["] to provide arbitrary precision arithmetic. By using an interpreted 
language and software implementations of higher than machine precision, we of 
course pay a large performance penalty (typically a factor of roughly a thousand). 
However most of the other methods we consider here are also implemented in such 
systems, so a comparison between the two is still meaningful. 

Moreover, our interest here is centered on methods that can handle generic, 
complex parameters. There can certainly be particular choices of parameters (and 
argument) for which other methods are more efficient, but when all of the parame- 
ters and the argument are permitted to be complex, the number of possible "special 
cases" grows bewilderingly large. So we focus on two methods that are proposed as 
general-purpose algorithms for hypergeometric functions, and also on the _E-method 
(mentioned in section 2.1) which bears superficial similarity to the method of this 
paper. 

4.1. Zeta function acceleration. This method [26, 27] is designed to evaluate 
q+iFq{l), and as such is perhaps still something of a special-purpose algorithm. But 
the branch point is the most challenging case, and the authors consider complex 
parameters as well as several optimizations of their method. That method is based 
on directly summing the first N terms of the series, and then approximating the 
remainder in terms of m Hurwitz zeta functions. The method of i ] extends that of 
[2()] by allowing for a complex parameter a that is determined through a symbolic 
algebra problem requiring the solution of a nonlinear optimization through Grobner 
bases. The complexity of this optimization problem grows with m, so the authors 
do not consider values of m as large as those we can easily handle with our method. 

In the optimized work [27], a few examples are considered, and here we compare 
two of them with our method. The timing results of ]27] were for an AMD Athlon64 
3500+ processor; our processor is comparable if a little faster (it is a 4600+ model). 

The first example considered in [27] are 3^2(1.6 + 7i, 2.4 - i, ^2; 3 + i, V6 + 1). 
The method of that paper is able to evaluate that function to ten digit accuracy 
with TV = 35 in 0.3 seconds; our method required = 17 and 0.37 seconds. But 
as the required precision is increased, the advantage of our method grows: 15 digit 
accuracy with the zeta-function method required A^ = 100 and 1.1 seconds, but for 
our method A^ = 25 and still 0.37 seconds; 35 digit accuracy required A^ = 3500 
and 14 seconds for them, but A^ = 110 and still just 0.37 seconds for us. 

In fact in every case considered in [27] our method out-performed that method, 
often substantially; to keep our discussion brief we consider just one more example. 
The most challenging example considered in that paper was 4^3(2.4 + 30i, —0.3 + 
0.5j,2.2 - 1,0.5 + i] 1.8, 1.1 - i,2 + 17i; 1). The authors could achieve 10 digit 
accuracy with roughly 1000 terms summed, whereas we achieve the same with only 
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136 terms. To achieve 20 digit accuracy they required approximately 6000 terms, 
whereas we needed only 290. It is true that we can use a larger value of m than 
those authors (we used 30 in our tests; they used either 10 or 15) but that is again 
because it is easy for us to increase m, as there is no system of m polynomials to 
solve in our method. 

4.2. Euler-Maclaurin summation. Euler-Maclaurin summation is based on a 
specific analytic form of the remainder of a series, expressed as an integral and a 
weighted sum of derivatives of the terms with respect to the term index [61]: 

(4.1) p„ = y^^^t(fc)dfc+-t(„+i)-^^i(2.-i)(„+i). 

Here the B2j are the Bernoulli numbers, and we have assumed that not only do 
the terms t^. go to zero as fc — cx), but so also do all of the derivatives of the terms 
with respect to the term index; the method is easily generalized when that does 
not hold. 

It is this need to integrate and differentiate terms with respect to the term index 
that makes this method challenging. For the Riemann zeta function, the authors of 
[(i l] could carry this out analytically, but for generalized hypergeometric functions 
an analytic solution seems intractable. However a numerical implementation of this 
method underlies the mpmath [21] calculation of generalized hypergeometric func- 
tions near the branch point, so we make our comparison with that implementation. 

The first test case considered in ] ] is 4^3(1, 1, |, 2; |, ^p, ^; 1). For mpmath 
and 25 digit accuracy, its Euler-Maclaurin based summation method requires 2.4 
seconds, while on the same machine our method requires only 0.5 seconds. But if 
we try to extend the test cases of the previous subsection, then the Euler-Maclaurin 
based approach is completely incapable of grappling: the first and simplest test case 
we considered runs for several minutes before returning a failure to converge. It 
fares even worse in the other test cases. 



4.3. S-method. Unlike the method of this paper, the previous two methods re- 
quire the computed partial sum only so they can add their estimates of the remain- 
der to that partial sum; the remainder itself they calculate without direct reference 
to the sequence of partial sums. Our method requires two successive partial sums, 
because both s and /i in equation (3.1) are unknown; we are effectively solving a 
2x2 linear equation. At the other extreme, we could use only our knowledge of the 
leading behavior of the remainder, and rather than precomputing the asymptotic 
coefficients c^, we can determine the coefficients in: 

(4.2) s„ = s-F VSfc— 3-. 

k=l 

Here the coefficients are related to our asymptotic coefficients c^, through — 

This approach is the i?-method already mentioned in the introduction, for the 
particular choice of functions gk{n) = /n^^^^^ . That method is described in the 
monograph [42] of Brezinski and Redivo Zaglia and was independently discovered 
by Schneider [64] , Havie ] ] and Brezinski ] ] ; a stable numerical implementation 
is described by Brezinski in ] ]. 
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Figure 6. The relative error (5si™'' for different orders m of the 
i?- method appHed to (3.4) . 



In fact, the i?-method is perhaps more properly thought of as a class of methods; 
most existing methods can be subsumed by specializing to a particular choice of 
the gkin). Indeed Levin's original work [47] on nonlinear sequence transformations 
can be analyzed as a specialization of the _E-method to an asymptotic expansion in 
inverse powers, with differing simple remainder estimates that enable application 
to a variety of different sequences. The algorithm we analyze now is another such 
specialization, where the remainder estimate is given by our analytic knowledge of 
only the leading order of the asymptotic truncation error. Thus the comparisons of 
this section can also be considered a comparison to a variant of Levin's methods. 

Not only is such a specification necessary to have a complete algorithm, but 
particular specializations will often allow simpler implementations than those de- 
scribed in the references above for the general-purpose i?- method. In our present 
case, the special form of our makes it simpler to use the recursive scheme in 
section 7.2 of Weniger's review article [39], itself based on the work of Fessler et al. 
in [65]. 

We compare first the complexity of the two approaches. If we assume that m ^ q, 
then the complexity of the algorithm of this paper is roughly ^m^ -I- lOm^ + (2m + 
7)N, while for this implementation of the i^-method it is Qrn?N . Thus, as N grows 
beyond to, the method of this paper has clearly better complexity; even when 
TO w iV it is somewhat superior. Note that the i^-method cannot have N < m, 
since we must always consider at minimum m -I- 1 partial sums. 

The real advantage of our method, however, is in its stability. To illustrate that, 
we return to the first example we considered, from equation (3.4) and as shown in 
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figure 1. Figure 6 sliows tlie corresponding plot for the method using calculations 
in 80 bit (long double) precision. Comparing to figure 1, we note several differences: 

(1) The overall error is larger. 

(2) The floating point error grows much more quickly with the order m of the 
transformation. 

(3) Once the minimum error is reached, the error rapidly begins increasing 
again, so that automatic termination would be much more challenging to 
implement. 

We can understand this overall loss of stability if we consider that the i?-method 
is essentially solving the linear system: 



(4.3) 



1 z"/n-^ 

1 z"+V(n+l)--^ z"+V(n+l)i-^ 



z"+V(n + 1)'"-!-^ 



1 z"+"/(n + m)-^ z"+"V(n + m)i--^ ... 




-f m 


m — 1 — A 




s 




Sn 




Cl 




Sn+1 


X 

















Of course, this system is not explicitly solved at each step, since it is only s that 
we need, but it is still the stability of the underlying system (4.3) that dictates 
the stability of the recursive scheme for s. We can quantify that through condition 
numbers. Examining figure 1, we see that we would expect to achieve 10 digit 
accuracy when n = 10 if our order is m = 30. Yet the condition number of the 
matrix of (4.3) for those choices is 1.3 x 10^®, far too large to allow a solution in 
long double precision. If we try to avoid this by decreasing m, then we must also 
increase n; again from figure 1 we estimate that if rn = 15 we would need roughly 
n = 40; now the condition number is 3 x 10^^. 

These large condition numbers are not coincidental. We saw that our method 
begins to lose precision whenever \s\ ^ since then the unknown that we care 
about in our linear system becomes much smaller than the other unknown; this 
must spring from instability in the underlying (2 x 2) system we are solving. With 
the .E-method, the same problem can arise if \s\ <C for any of the that we 
solve for. But that will happen generically: the coefficients are asymptotic and 
grow rapidly with m; for the example we consider here we have |c3o| — 10^^. Hence 
the corresponding linear systems must be unstable. 

From this perspective, the chief advantage of the method of this paper is that it 
bypasses such an unstable linear system. Instead, as we saw in section 2.4, we can 
determine the a;„ to high accuracy, and that accuracy does not decrease rapidly 
with TO, and actually increases with n. For this reason, our method is much more 
stable. 



5. Conclusions 

Summarizing, we have shown that it is possible to derive the asymptotics of the 
remainders of the partial sums of the generalized hypergeometric function q+iFg 
to any desired order in inverse powers of n. We have given explicit formulas for 
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the remainders in terms of the hypergeometric parameters and argument. This 
analytic result is the basis for a new series acceleration technique that can dramat- 
ically accelerate the convergence of the generalized hypergeometric series, making 
it feasible to evaluate these for complex arguments, even at the branch point z = 1. 
As implemented in C, the algorithm can be limited by the fixed precision of stan- 
dard floating point types, but even in this case the precise asymptotic knowledge 
available enables us to determine correctly when the acceleration has converged. 
The method seems much more efficient and robust than any others we have found 
in the literature that are applicable to q+iFq at generic complex arguments and 
parameter. 

There are still some open issues, which are natural starting points for future 
research: 

• At present, the algorithm is very slow near the branch point, much more so 
than at the branch point. As shown by Biihring [.Hd-:]:!] and N0rlund [^4], 
there is a close connection between the asymptotics of the partial sums at 
the branch point and the behavior of the function near the branch point. It 
would be interesting to see if this can be leveraged to evaluate the function 
near the branch point using the (faster) evaluation at the branch point; of 
course this is not just a simple series expansion precisely because we are 
near a singularity. 

• As we have noted throughout this paper, our C implementation is limited 
by fixed fioating point precision. Of course it is straightforward to imple- 
ment the algorithm in any of the free or commercial symbolic computational 
programs that support arbitrary precision, but it could also be useful to 
continue development of an arbitrary precision routine in a low-level lan- 
guage, by taking advantage of existing higher precision libraries like MPFR 
[66] or QD [67]. Moreover, the acceleration presented here is almost cer- 
tainly not ideal for all inputs; for small \z\, for instance, there is no need to 
use any acceleration at all. Even when the method of this paper is best, a 
more automatic implementation should choose the order m and maximum 
iterations N to minimize the computation required to achieve the desired 
accuracy. All of these considerations together could lead to a reliable and 
fast library for generalized hypergeometric evaluations. 

• We have focused in this paper on the case p = q + 1 because that restriction 
was needed to apply the results of [IS]. However the general case can be 
handled by including the further results of those same authors in [V.)]; work 
on this extension is already underway. 

• We have also only described an algorithm in which we use the precise re- 
mainder estimates and a constant correction factor fj,, in contrast to tra- 
ditional series acceleration techniques that use simple remainder estimates 
and sophisticated correction factors. But the two choices are not exclusive, 
and it would be interesting to investigate the performance of a method that 
combines the remainder estimates of this paper with the correction factors 
fj,n of traditional series acceleration techniques. 

• We have limited ourselves to asymptotic expansions in inverse powers of 
n, because those are the asymptotic functions in which the results of ]")<s] 
are couched. However, Weniger has found [28, 68] that inverse factorial 
series can also be very useful — in some cases much more powerful — than 
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inverse powers, and it is worth investigating if a similar expansion would 
be effective here. In particular, it is shown in [28, 68] than an expansion 
in inverse powers can be transformed to an expansion in inverse factorial 
series, so the question is really how the stability and efficiency of such a 
scheme compares with the method presented here. 
• Finally, it is worth investigating how well the method of this paper performs 
when applied to other functions for which an analytic asymptotic expansion 
of the term ratio is available. Though many series of practical interest in 
the sciences are available only numerically or as expensive computations 
(e.g., perturbation series), there are still other series of practical interest 
where we have available the necessary analytic knowledge, and would like 
to take advantage of that knowledge to efficiently and robustly evaluate the 
functions. 
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